46 research outputs found

Affect Recognition in Conversations Using Large Language Models

Authors: Feng Shutong; Gašić Milica; Lubis Nurul; Sun Guangzhi; Zhang Chao
Publication date: 22/09/2023

Affect recognition, encompassing emotions, moods, and feelings, plays a pivotal role in human communication. In the realm of conversational artificial intelligence (AI), the ability to discern and respond to human affective cues is a critical factor for creating engaging and empathetic interactions. This study delves into the capacity of large language models (LLMs) to recognise human affect in conversations, with a focus on both open-domain chit-chat dialogues and task-oriented dialogues. Leveraging three diverse datasets, namely IEMOCAP, EmoWOZ, and DAIC-WOZ, covering a spectrum of dialogues from casual conversations to clinical interviews, we evaluated and compared LLMs' performance in affect recognition. Our investigation explores the zero-shot and few-shot capabilities of LLMs through in-context learning (ICL) as well as their model capacities through task-specific fine-tuning. Additionally, this study takes into account the potential impact of automatic speech recognition (ASR) errors on LLM predictions. With this work, we aim to shed light on the extent to which LLMs can replicate human-like affect recognition capabilities in conversations.

arXiv.org e-Print Archive

Speech-based Slot Filling using Large Language Models

Authors: Feng Shutong; Gašić Milica; Jiang Dongcheng; Sun Guangzhi; Woodland Philip C.; Zhang Chao
Publication date: 13/11/2023

Recently, advancements in large language models (LLMs) have shown an unprecedented ability across various language tasks. This paper investigates the potential application of LLMs to slot filling with noisy ASR transcriptions, via both in-context learning and task-specific fine-tuning. Dedicated prompt designs and fine-tuning approaches are proposed to improve the robustness of LLMs for slot filling with noisy ASR transcriptions. Moreover, a linearised knowledge injection (LKI) scheme is also proposed to integrate dynamic external knowledge into LLMs. Experiments were performed on SLURP to quantify the performance of LLMs, including GPT-3.5-turbo, GPT-4, LLaMA-13B and Vicuna-13B (v1.1 and v1.5) with different ASR error rates. The use of the proposed fine-tuning together with the LKI scheme for LLaMA-13B achieved an 8.3% absolute SLU-F1 improvement compared to the strong Flan-T5-base baseline system on a limited data setup.

arXiv.org e-Print Archive

CAMELL: Confidence-based Acquisition Model for Efficient Self-supervised Active Learning with Label Validation

Authors: Feng Shutong; Gašić Milica; Geishauser Christian; Heck Michael; Lin Hsien-chin; Lubis Nurul; Ruppik Benjamin; van Niekerk Carel; Vukovic Renato
Publication date: 13/10/2023

Supervised neural approaches are hindered by their dependence on large, meticulously annotated datasets, a requirement that is particularly cumbersome for sequential tasks. The quality of annotations tends to deteriorate with the transition from expert-based to crowd-sourced labelling. To address these challenges, we present CAMELL (Confidence-based Acquisition Model for Efficient self-supervised active Learning with Label validation), a pool-based active learning framework tailored for sequential multi-output problems. CAMELL possesses three core features: (1) it requires expert annotators to label only a fraction of a chosen sequence, (2) it facilitates self-supervision for the remainder of the sequence, and (3) it employs a label validation mechanism to prevent erroneous labels from contaminating the dataset and harming model performance. We evaluate CAMELL on sequential tasks, with a special emphasis on dialogue belief tracking, a task plagued by the constraints of limited and noisy datasets. Our experiments demonstrate that CAMELL outperforms the baselines in terms of efficiency. Furthermore, the data corrections suggested by our method contribute to an overall improvement in the quality of the resulting datasets.

arXiv.org e-Print Archive

ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an Opportunity?

Authors: Feng Shutong; Gašić Milica; Geishauser Christian; Heck Michael; Lin Hsien-Chin; Lubis Nurul; Ruppik Benjamin; van Niekerk Carel; Vukovic Renato
Publication date: 02/06/2023

Recent research on dialogue state tracking (DST) focuses on methods that allow few- and zero-shot transfer to new domains or schemas. However, performance gains heavily depend on aggressive data augmentation and fine-tuning of ever larger language model based architectures. In contrast, general purpose language models, trained on large amounts of diverse data, hold the promise of solving any kind of task without task-specific training. We present preliminary experimental results on the ChatGPT research preview, showing that ChatGPT achieves state-of-the-art performance in zero-shot DST. Despite our findings, we argue that properties inherent to general purpose models limit their ability to replace specialized systems. We further theorize that the in-context learning capabilities of such models will likely become powerful tools to support the development of dedicated and dynamic dialogue state trackers.
Comment: 13 pages, 3 figures, accepted at ACL 202

arXiv.org e-Print Archive

From Chatter to Matter: Addressing Critical Steps of Emotion Recognition Learning in Task-oriented Dialogue

Authors: Feng Shutong; Gašić Milica; Geishauser Christian; Heck Michael; Lin Hsien-chin; Lubis Nurul; Ruppik Benjamin; van Niekerk Carel; Vukovic Renato
Publication date: 24/08/2023

Emotion recognition in conversations (ERC) is a crucial task for building human-like conversational agents. While substantial efforts have been devoted to ERC for chit-chat dialogues, the task-oriented counterpart is largely left unattended. Directly applying chit-chat ERC models to task-oriented dialogues (ToDs) results in suboptimal performance as these models overlook key features such as the correlation between emotions and task completion in ToDs. In this paper, we propose a framework that turns a chit-chat ERC model into a task-oriented one, addressing three critical aspects: data, features and objective. First, we devise two ways of augmenting rare emotions to improve ERC performance. Second, we use dialogue states as auxiliary features to incorporate key information from the goal of the user. Lastly, we leverage a multi-aspect emotion definition in ToDs to devise a multi-task learning objective and a novel emotion-distance weighted loss function. Our framework yields significant improvements for a range of chit-chat ERC models on EmoWOZ, a large-scale dataset for user emotion in ToDs. We further investigate the generalisability of the best resulting model to predict user satisfaction in different ToD datasets. A comparison with supervised baselines shows a strong zero-shot capability, highlighting the potential usage of our framework in wider scenarios.
Comment: Accepted by SIGDIAL 202

arXiv.org e-Print Archive

EmoUS: Simulating User Emotions in Task-Oriented Dialogues

Authors: Feng Shutong; Gašić Milica; Geishauser Christian; Heck Michael; Lin Hsien-Chin; Lubis Nurul; Ruppik Benjamin; van Niekerk Carel; Vukovic Renato
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/06/2023

Existing user simulators (USs) for task-oriented dialogue systems only model user behaviour on semantic and natural language levels without considering the user persona and emotions. Optimising dialogue systems with generic user policies, which cannot model diverse user behaviour driven by different emotional states, may result in a high drop-off rate when deployed in the real world. Thus, we present EmoUS, a user simulator that learns to simulate user emotions alongside user behaviour. EmoUS generates user emotions, semantic actions, and natural language responses based on the user goal, the dialogue history, and the user persona. By analysing what kind of system behaviour elicits what kind of user emotions, we show that EmoUS can be used as a probe to evaluate a variety of dialogue systems and in particular their effect on the user's emotional state. Developing such methods is important in the age of large language model chat-bots and rising ethical concerns.
Comment: accepted by SIGIR202

arXiv.org e-Print Archive

The complete mitochondrial genomes of Diplonevra funebris and Diplonevra peregrina (Diptera: Phoridae)

Authors: Dapeng Sun; Dianxing Feng; Shutong Dai
Publication venue: Taylor & Francis Group
Publication date: 01/03/2021

Diplonevra is one of the most important genera in the family Phoridae. This genus is mainly distributed in the Palearctic region, and its species can be used to estimate the postmortem interval. In this study, we present for the first time the mitochondrial genomes of two common necrophagous species of this genus, Diplonevra funebris (Meigen, 1830) and Diplonevra peregrina (Wiedemann, 1830). A maximum-likelihood phylogenetic tree revealed that the genus Diplonevra is closely related to the genus Dohrniphora within the family Phoridae. This work expands the knowledge of Phoridae genomes and contributes to the further study of species identification and phylogenetics in this family.

Directory of Open Access Journals